(1) Real Party in Interest

A statement identifying by name the real party in interest is 
contained in the brief. 

(2) Related Appeals and Interferences 

The examiner is not aware of any related appeals, interferences, 
or judicial proceedings which will directly affect or be directly 
affected by or have a bearing on the Board's decision in the pending
appeal.

(3) Status of Claims 

The statement of the status of claims contained in the brief is 
correct.

(4) Status of Amendments After Final 

No amendment after final has been filed. 

(5) Summary of Claimed Subject Matter 

The summary of claimed subject matter contained in the brief is 
correct.

(6) Grounds of Rejection to be Reviewed on Appeal 

The appellant's statement of the grounds of rejection to be 
reviewed on appeal is correct. 

(7) Claims Appendix 

The copy of the appealed claims contained in the Appendix to the 
brief is correct. 

(8) Evidence Relied Upon 

5,675,708 FITZPATRICK 10-1997

US 2001/0027396 A1 SATO 10-2001



Application/Control Number: 09/752,611 
Art Unit: 2644 






(9) Grounds of Rejection 

The following ground(s) of rejection are applicable to the
appealed claims:

(I) Claims 1-2, 5-8, 10-12, 15-17, and 20 are rejected under 35
U.S.C. 103(a) as being unpatentable over Sato (US 2001/0027396 A1) in
view of Fitzpatrick et al (USPN 5675708). Hereafter, "Fitzpatrick et
al" will be referred to as "Fitzpatrick".

Sato discloses the audible synthesis and emission of data related
to an audio file, relative to the playing of the audio file. The data
involves information about the audio file ranging from the title to
the type of the music (page 3, para. 0065, and Figure 90). The data
is passed through a voice synthesizer (23) to convert the data into an
audible output compatible format, and the data is output in various
forms of synchronism with the audio file, ranging from the start or
end of the audio file to a detected volume condition of the file
(para. 0053, 0074, 0075). Regarding Claim 1, the selection of the
relevant audio data with the extraction unit (21) for the voice
synthesizer (23) reads on "reading descriptive information about an
audio file from meta-data for the audio file" (para. 0061). The
synchronism between the playing of the audio file and the audio data
from the synthesizer reads on the concept of "concatenating at least a
portion of an audio format of the descriptive information". However,
the data from the synthesizer (23) is passed through a D/A converter
before it is chronologically associated with the audio data of the
audio file.

Thus, Sato does not clearly specify: 

that the concatenating of the at least a portion of the audio 
format of the descriptive information is executed to an audio 
file 

Fitzpatrick discloses a system for converting various forms of
multimedia data into audio media. The process involves the inputting
of a file or multimedia data stream (col. 3, lines 17-22). The
process involves aligning entities from a file on a modified output
file (col. 3, lines 66-67 and col. 4, lines 1-6). Entities include
text words or phrases that may be converted to a spoken word, as well
as audio elements (col. 3, lines 43-46 and 57-61). The entity that is
written to the output file is the associated digitized audio format of
the entity (col. 4, lines 1-2). Fitzpatrick also discloses a process for
providing an audio equivalent for data that does not have a standard,
discernable equivalent (col. 4, lines 8-34). The concept of writing
multiple digital audio entities to a file, in view of the effective
signal composition of Sato, reads on "concatenating at least a portion
of an audio format of the descriptive information to the audio file".

To one of ordinary skill in the art at the time the invention was
made, it would have been obvious to perform the signal
combination of Sato in the digital domain through a method such as the
subsequent writing of entities as disclosed by Fitzpatrick. The
motivation behind such a modification would have been that such
digital processing would not have required hardware capable of
efficient processing for the real time production of output.
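
For illustration only, the digital-domain combination described above
can be sketched as the sequential writing of entities to one output
file. This is a minimal sketch and not the implementation of either
reference: the function names, the WAV container, the 8000 Hz rate, and
the stand-in synthesize() routine are all assumptions introduced here.

```python
import io
import wave

def synthesize(text):
    """Hypothetical stand-in for a voice synthesizer: one 16-bit sample per character."""
    return b"".join((ord(ch) % 32768).to_bytes(2, "little") for ch in text)

def write_entities(entities, sample_rate=8000):
    """Write each entity's digitized audio, in order, to one in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(1)      # mono (assumed)
        out.setsampwidth(2)      # 16-bit samples (assumed)
        out.setframerate(sample_rate)
        for kind, payload in entities:
            # Text entities are converted to audio first; audio entities pass through.
            pcm = synthesize(payload) if kind == "text" else payload
            out.writeframes(pcm)
    return buffer.getvalue()

music = b"\x01\x00" * 4  # four placeholder music samples
output_file = write_entities([("audio", music), ("text", "Hi")])
```

Because the entities are written to a file rather than rendered to a
speaker, nothing in the sketch has to keep pace with real time playback.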

Regarding Claim 2, the voice synthesizer (23) of Sato converts 
the text information to voice data, which is provided through D/A 
converters (13a, 13b) to be emitted by a loudspeaker, the functions of 
the synthesizer reading on "converting the descriptive information to 
the audio format prior to concatenating" (para. 0059). Fitzpatrick
also notes certain text data as convertible to a spoken phrase (col.
3, lines 57-61).

Regarding Claim 5, one embodiment of Sato involves deriving the 
data information from the ID3 tag of an MPEG-1 Layer 3 format, which 
reads on "the audio file comprises the metadata" (para. 0065). Sato 
also notes that such data can be shown on a device with a text 
display, and that the disclosed combination may be executed on a 
device with a display, which provides support for retaining such data 
in the output file produced by Fitzpatrick (para. 0007, 0094).
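
For illustration only, reading descriptive information from meta-data
carried inside the audio file might look like the sketch below. It
assumes the ID3v1 layout (a 128-byte block at the end of the file that
begins with the marker "TAG"); the cited passage of Sato refers only to
an ID3 tag and does not state a version, and the function name is
hypothetical.

```python
def read_id3v1(data):
    """Return title and artist from a trailing 128-byte ID3v1 tag, or None if absent."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(field):
        # Fields are fixed width and padded; keep the part before the first NUL.
        return field.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {"title": text(tag[3:33]), "artist": text(tag[33:63])}

# Placeholder file: a few audio bytes followed by a hand-built tag.
tag = b"TAG" + b"Song".ljust(30, b"\x00") + b"Artist".ljust(30, b"\x00") + bytes(65)
info = read_id3v1(b"\xff\xfb" + bytes(10) + tag)
```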

Regarding Claim 6, please refer to the like teachings of Claim 1, 
noting that one of the synchronism options involves outputting the 
data information at a certain time after the start of the playing of 
an audio file, which reads on the concept of "mixing" (para. 0072). 
It is noted herein that the implementation of such a process, in view 
of the desirable modification proposed above, would involve performing 
such mixing in the digital domain, again, with the motivation being 
the elimination of the requirement of components capable of real time
processing. Such digital addition or mixing is substantially well 
known in the art, support for which can be found, for example, in 
Farhangi et al (USPN 5647008), which has been included with this 
office action. In the teachings of Fitzpatrick, the resultant signal 
is written to a new file designated as an output file (col. 3, lines 
22-24 and col. 4, lines 1-2 and 63-67). This process of writing of 
entities reads on "generating a new audio file containing audio data 
resulting from the mixing". 
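
For illustration only, digital addition or mixing of this kind can be
sketched as sample-by-sample addition with clipping. This is a minimal
sketch under stated assumptions (signed 16-bit samples, a hypothetical
mix_pcm function, and an offset standing in for the "certain time after
the start" timing); it is not taken from Sato, Fitzpatrick, or Farhangi.

```python
import array

def mix_pcm(music, voice, offset=0):
    """Add voice samples onto music samples starting at offset, clipping to 16 bits."""
    mixed = array.array("h", music)  # copy, so the original samples are untouched
    for i, sample in enumerate(voice):
        pos = offset + i
        if pos >= len(mixed):
            break  # voice samples past the end of the music are dropped
        total = mixed[pos] + sample
        mixed[pos] = max(-32768, min(32767, total))
    return mixed

mixed = mix_pcm([100, 200, 300, 400], [10, 20], offset=1)
```

The returned samples are a new sequence, which could then be written out
as a new audio file separate from the original.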

Regarding Claim 7, please refer to the like teachings of Claim 2. 

Regarding Claim 8, the start reproduction time is one of the 
synchronization options, which reads on "at least a portion of the 
audio format of the descriptive information is mixed with audio at the 
beginning of the audio file" (para. 0070).

Regarding Claim 10, please refer to the like teachings of Claim 
5. Regarding Claim 11, please refer to the like teachings of Claim 1, 
noting that Sato discloses the text information read out program as 
being recorded on a computer readable recording medium (para. 0108).
Regarding Claim 12, please refer to the like teachings of Claim 2. 
Regarding Claim 15, please refer to the like teachings of Claim 5. 
Regarding Claim 16, please refer to the like teachings of Claim 1, 
noting that the program is installed on a computer system (Figure 2) 
from a readable recording medium (para. 0108). Regarding Claim 17, 
please refer to the like teachings of Claim 2. Regarding Claim 20, 
please refer to the like teachings of Claim 5. 

(II) Claims 3-4, 9, 13-14, and 18-19 were rejected under 35
U.S.C. 103(a) as being unpatentable over Sato in view of Fitzpatrick
as applied above, and further in view of Yumura et al (USPN 5834670).
Hereafter, "Yumura et al" will simply be referred to as "Yumura". 

As detailed above, Sato discloses a system for selectively 
including information about an audio file into the audible playing of 
the audio file. Sato discloses a variety of timing at which the audio 
file information may be emitted by the speaker (14) in relation to the 
playing of the audio file. Fitzpatrick discloses the notion of 
digitally combining audible parts of an input file into a different 
file.

However, Sato in view of Fitzpatrick does not specify:

that the audio format of the descriptive information is
concatenated to the beginning of the audio file

Yumura discloses a system for audibly presenting information about a
song and the user requesting a song in a karaoke system. The audio
file name and requester's name are input to a local terminal of the
karaoke system with an input device (23). This information, processed
by a speech synthesis unit (25) influenced by the genre of the song, is
output to the speakers during an introduction, interlude, or just
before a song (col. 3, lines 13-35). The playing of the song
information data reads on "at least a portion of the audio format of
the descriptive information is concatenated to the beginning of the
audio file".

To one of ordinary skill in the art at the time the invention was
made, it would have been obvious to incorporate the emission of the
song data before the playing of the song as taught by Yumura into the
system of Sato in view of Fitzpatrick. The motivation behind such a 
modification would have been that such an arrangement would have 
enabled users of the system to directly identify information regarding 
a song to be played before the actual playing of the song. Playing 
the song data before the actual song would have left the song to be 
heard in its original form and prevented any unpleasant sound caused 
by the overlapping of the music and synthesized voice data. 

Regarding Claim 4, the system of Yumura involves a main computer 
source which stores song information and a terminal computer source 
which requests and plays the stored music (col. 2, lines 44-67). Song 
data is transmitted from the main unit (1) and the terminal (2), and 
the synthesis of the song title and other information involves the use 
of data received in this transmission (col. 3, lines 15-18). This 
aspect of the invention, which improves the quality of the synthesized 
audio, reads on "the concatenating is performed in response to an 
operation to transfer the audio file from a first computer system to a 
second computer system". 

Regarding Claim 9, please refer to the like teachings of Claim 4. 

Regarding Claim 13, please refer to the like teachings of Claim 3. 

Regarding Claim 14, please refer to the like teachings of Claim 4. 

Regarding Claim 18, please refer to the like teachings of Claim 3. 

Regarding Claim 19, please refer to the like teachings of Claim 4. 

(10) Response to Argument

(I) Summary of Rejection 

Two groups of rejections were presented in the final office 
action, the first combining the teachings of Sato in view of 
Fitzpatrick and the second combining the teachings of Sato in view of 
Fitzpatrick in further view of Yumura. The first group of rejections 
demonstrated each of the independent claims (1, 6, 11, 16) in the present
application as obvious and thus not patentable. Both groups were
rejected under 35 U.S.C. 103(a). Notwithstanding the other basic
criteria, the application of references under this statute requires
that the prior art at least teach or suggest all claim limitations as
claimed, as noted in MPEP 2143.

As applied in the above rejection, Sato delimits the basic 
concept - the synthesis and playback of text information from an audio 
file with the sound or music from the same audio file - underlying the 
present application, as claimed. Sato also includes details regarding 
the particulars of the implementation behind such a concept, though 
not all of the details of implementation as claimed. The teachings of 
Fitzpatrick, however, remedy this difference in implementation between 
the teachings of Sato and the present application as claimed. The 
teachings of Fitzpatrick also provide motivation for utilizing the 
manner of implementation denoted in Fitzpatrick, thereby establishing 
a prima facie case of obviousness. 

Specifically, Sato discloses a system that produces an audible
signal that includes audible information concatenated after or mixed
with audio/music signals to which the audible information pertains,
wherein the audible information and the audio/music signals are
originally derived from the same audio file (abstract, para.
0009-0016). Thus, as noted with more particular citations above, Sato
teaches "A method comprising reading descriptive information about an
audio file from meta-data for the audio file; and concatenating at
least a portion of an audio format of the descriptive information" as
claimed in claim 1. Such an interpretation of Sato pertaining to the
claim language involves interpreting the 'audio format' of Sato to be
the analog format of the signal output from the D/A converter (13b)
and the 'concatenation' to take place at the connection point (see
the excerpt of Figure 2 below) where the outputs of the two D/A
converters (13a-b) connect to create a composite output signal for the
speaker (14). The timing of this combination of analog signals can be
controlled in Sato to be during or after the music signal (para.
0089-0094).

[Figure: From Figure 2 of Sato, showing connection or 'concatenation'
of signals that are output through loudspeaker 14. D/A converters 13a
and 13b both feed loudspeaker 14.]

However, the claim language recites "concatenating... an audio
format... to an audio file" or "mixing... an audio format... with the audio
file", which requires the claimed "audio format" to be digital, thus
meaning that the claimed 'concatenation' or 'mixing' takes place
between two
digital signals, treating the signals in the digital domain. Thus,
while Sato teaches an analogous combination of audio signals, the
combination of Sato is performed in the analog domain. However,
Fitzpatrick teaches that digital audio signals, whether representing
audio sounds or text that is converted to a spoken word, can be combined
into an audio output file (col. 3, line 57-col. 4, line 2). The
implication of such an output file in the system of Fitzpatrick is
that non-efficient or less-efficient hardware may perform the media
conversion process (col. 5, lines 8-13, for example). This can be
contrasted with the system of Sato wherein the real time,
direct-to-speaker output requires the efficient, multi-tasking
operation of the CPU and at least two D/A converters to extract,
transfer, synthesize, synchronize, volume monitor, and D/A convert
digital audio and related data (para. 0055-62 and 0070-0086, for
example). In Fitzpatrick, the sequential writing of digitized audio
data into an output file, specifically preceded by a header and
followed by a concatenation of an end-of-file pointer (col. 3, lines
22-26 and col. 4, lines 1-2, 35-37, 50-53, and 63-65), at least
suggests the claimed "concatenating... to the audio file" or "mixing...
with the audio file", particularly in view of the desire in Sato to
place the synthesized information during or at the end of the file
(para. 0089-90). The art-appropriate interpretation of this
"concatenating" will be further discussed below, with regards to a
particular argument by the applicant. However, as summarized and
detailed further above, Sato in view of Fitzpatrick at least suggests
the methods and apparatuses of the present application, so far as said
methods and apparatuses are claimed and considered as a whole.

(II) Summary of the Applicant's Arguments

The applicant's remarks and position fail to appreciate that the
rejections of the final office action are applied under 35 U.S.C.
103(a), and as such, involve at least what the references collectively
suggest. The applicant's remarks also further fail to appreciate that
the limitations in question are rejected in view of a combination of
references, not single references alone. Per MPEP 2145, it is known 
that one cannot show nonobviousness by attacking references 
individually where the rejections are based on combinations of 
references. The applicant also appears to argue that since the 
reference of Fitzpatrick can process a broader range of input files 
(or "do more") than just audio files, that the teachings found therein 
cannot be applied to Sato. The examiner respectfully disagrees, as is 
further detailed below. 

(III) Specific Responses to Applicant's Remarks 

On page 5, lines 1-2, the applicant has stated, "Rather than
concatenating the synthesized voice to the audio file, Sato simply
plays the synthesized voice through a speaker" and "Applicants
respectfully assert that rendering of audio information from a
synthesizer is not concatenation". The examiner respectfully notes,
however, that the output of the synthesizer is applied via a D/A
converter to the same signal line as that which carries the music
audio signal (see Figure 2 of Sato, in view of para. 0059, wherein the
loudspeaker 14 converts both analog signals). So far as these analog
signals are both applied through the same loudspeaker, this
combination of signals at or before the speaker 14, particularly
in view of the "end of piece" or "predetermined period of passing
time" synchronization timings, at least teaches the "concatenating" of
the synthesized text information in an audible format to the audio
data of the audio file. The further applied teachings of Fitzpatrick
(as well as the applicant's own specification) substantiate that this
concatenating or mixing, as delimited and intended to be interpreted
in the pending claim language, is not patentable over the teachings of
Sato in view of Fitzpatrick, as applied in the final office action.

[Figure: excerpt of Figure 2 of Sato, showing D/A converters 13a and
13b feeding loudspeaker 14.]

In a related remark, on page 5, lines 9-14, the applicant has
stated, "During prosecution, the Examiner asserted regarding the Sato
reference that 'the synchronism between the playing of the audio file
and audio data from the synthesizer reads on the concept of
"concatenating at least a portion of an audio format of the
descriptive information"'" and "The Examiner appears to have backed off
from this position (see following paragraph), but Applicants wish to
assert their argument against this position for the record". As
discussed in the above paragraph, the examiner has not "backed off"
this position. Sato does teach this concept (emphasis added), as
worded in the final rejection, but not this concept in the digital
format as particularly claimed. Specifically, the phrase "to an audio
file" has been interpreted to mandate the concatenation or mixing to
take place in the digital domain between digital audio signals.
However, Sato still teaches the fundamental, underlying process, even
if the steps are performed in a different format (digital instead of
analog). Sato, in comparison, concatenates or mixes an audio format
(analog) of the descriptive information to or with an audio signal
indicative of the audio file, not "to" or "with" the components of the
initial audio file, in their digital file format, itself.

At this point it should be noted that the pending claim language 
of "concatenating... an audio format... to the audio file" and "mixing an 
audio format ... with the audio file" is marginally ambiguous or at 
least misleading with respect to the technical arts represented by 
such a choice of language. Such language, at face value, appears to 
denote literally taking an audio file, stored in a certain space in a 
memory, and directly tacking the audibly formatted meta-data 
information to the space in series behind or before the space occupied 
(or logically occupied) by the audio file. This implication, though, 
conflicts with the applicant's own specification as well as the 
technical details of audio files. As is notoriously well known in the 
arts, files traditionally have non-audio data at the beginning and end 
of the files, respectively known as headers and end delimiters (or 
end-of-file pointers). Fitzpatrick substantiates this position at
least with respect to the input media files as well as the output file 
created therein (see col. 3, lines 22-26 and 54-57 and col. 4, lines 
58-65). The applicant's own specification also admits that audio 
files have meta-data stored after audio data at the end of the audio 
file (page 3, lines 8-10). Thus, literally concatenating the audio
format of the descriptive information, which implies concatenating to 
the end or beginning of the audio file, would result in an improper 
file format. This format would be improper because the audible format 
of the descriptive information would not be found or recognized by 
existing playback programs (noted by applicant, page 4, lines 23-25) 
since it is before or after the data that indicates that the beginning 
or end of the file has been encountered (and thus, no more data for 
playback is in the file, per such end or beginning markers, even 
though the literal interpretation of the pending claim language would 
result otherwise). The 'concatenating' of the independent claims must
be interpreted as potentially being 'before' or 'after' any file
format, so far as claims 3 and the like further define such
concatenating as 'to the beginning'. The fact that there may be no
auxiliary data at the beginning or end of the audio file is irrelevant 
so far as the details of the audio file format are not claimed. 
However, this literal interpretation is not what is intended by the 
applicant, as is evidenced by the applicant's own specification. See 
page 6, lines 17-19; page 7, lines 18-20 and 23-28, for example. As 
detailed in these lines, "concatenating... an audio format... to the audio 
file" involves concatenating the audio formatted descriptive
information (138) to the audio content (118) of the audio file (132),
not to the audio file (132) itself (which again, would potentially
cause the above problems of making the added data not read or found by
a file playback program). This form of concatenating, putting or
writing audio formatted signals (including audio sounds and text that
can be converted to a spoken word, such as in the audio file of Sato) in
adjacent positions, is the same form of 'concatenating' as performed
in Fitzpatrick, in which human discernable (audio) formatted
entities, including audio sound and converted text, are written
sequentially into the output file. This limitation in the claims has
been interpreted in light of the technical suggestion of the
applicant's specification (concatenating to the audio content in an
audio file), as is further applied in the rejection above,
particularly with respect to the application of the Fitzpatrick
reference.
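
For illustration only, the difference between literally concatenating
to the file and concatenating to the audio content of the file can be
sketched with a WAV container, whose header records the length of the
audio data. The WAV format and the helper names are assumptions chosen
for the sketch; they are not the file format at issue in the
application or the references.

```python
import io
import wave

def make_wav(pcm, sample_rate=8000):
    """Build a mono, 16-bit WAV file in memory from raw PCM bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(sample_rate)
        out.writeframes(pcm)
    return buffer.getvalue()

def playable_frames(file_bytes):
    """Return the PCM frames that a header-respecting reader finds in the file."""
    with wave.open(io.BytesIO(file_bytes), "rb") as reader:
        return reader.readframes(reader.getnframes())

music = b"\x01\x00" * 4         # placeholder music samples
announcement = b"\x02\x00" * 2  # placeholder synthesized-title samples

# Literal concatenation "to the file": bytes tacked on after the end of the file.
literal = make_wav(music) + announcement
# Concatenation to the audio content: a new file whose header covers both parts.
content_level = make_wav(music + announcement)
```

In this sketch a reader that honors the header lengths returns only the
music from the first file, while the second file yields the music
followed by the announcement.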

A similar interpretation must be -and has been- applied to the
'mixing' limitation of Claim 6. Literally "mixing an audio format...
with the audio file" would inherently create a new file, since the
original audio file would be modified by the mixed in audio format.
However, Claim 6 recites the additional, separate step of 'generating
a new audio file', thus indicating that the 'mixing... with the audio
file' is not to be interpreted literally as mixing 'with the file',
else the 'generating' step would be superfluous. Such literal
'mixing' is also an incorrect interpretation of the claim language of
claim 6 because it suggests that the 'audio format' may be mixed with
the metadata of the audio file, which is also an improper file format
so far as the added audible format of the descriptive information
would not be found or recognized for playback with other audio data by
existing playback programs (noted by applicant, page 4, lines 23-25),
since it would be in the metadata area instead of the expected audio
data area. Similar to the concatenating above, this 'mixing' is
intended to be technically interpreted as mixing the audio title (138)
with the audio content (118) of the audio file (132) (see page 7, line
28 - page 8, line 5 of the applicant's specification as originally
filed), not the broad and ambiguous implication of "mixing... with the
audio file". To reiterate, reading the 'mixing' limitation of Claim 6
literally would invoke enablement as well as written description
problems as noted above. The final rejection, as well as this
examiner's answer, have considered this limitation in view of the
appropriate, technical interpretation of the claim language, as is
substantiated by the applicant's specification.

On page 5, lines 19-20, the applicant has stated, "The Office 
Action attempts to remedy this deficiency in the prima facie showing 
of obviousness by relying on Fitzpatrick. However, such reliance must 
fail". It is respectfully noted that the context of "must fail" is 
unclear. The applicant appears to suggest that Fitzpatrick inherently
cannot remedy the alleged deficiencies of Sato. The logic behind this
statement is not readily apparent, as it suggests that the teachings
of Fitzpatrick are altogether irrelevant, regardless of what they do
or do not contain. Note the difference between the phrase "does fail"
and the chosen phrase of "must fail". Regardless, the teachings of
Fitzpatrick can remedy the deficiency of Sato and do remedy this 
deficiency as discussed above and further demonstrated below with 
regards to the applicant's remaining arguments. 

On page 5, lines 25-28, the applicant has stated, "Neither Sato 
nor Fitzpatrick, alone or in combination, discloses, suggests or 
teaches 'concatenating at least a portion of an audio format of the
descriptive information to the audio file' (Claims 1, 11, and 16, in
part, emphasis added)" and "Fitzpatrick discloses instead translating 
or converting a multimedia data stream or file to an audio media". 
The examiner respectfully disagrees. Sato in view of Fitzpatrick at 
least suggests this limitation. As detailed above, the technical 
implication of "concatenating... to the audio file" is that the 
synthesized contents information data is concatenated to the audio 
music data, as noted in the applicant's specification. To infer 
otherwise would invoke enablement and written description issues, as 
discussed above. The use of different descriptive language does not 
change the underlying technical meaning of said language, again, which 
is described in the applicant's own specification. Sato discloses 
that voice-synthesized contents information can be mixed over or 
played back after the music to which it pertains. Fitzpatrick teaches 
that the playback of synthesized text information and music from the 
same file may be combined into a single file for output. The alleged 
'translating or converting' of Fitzpatrick (understood to be the 
sequential writing of digitized audio to the output file, col. 3, line
66 - col. 4, line 2) is the same technical process of 'concatenating'
as present in the pending claims, as is further demonstrated by the
applicant's own specification. These two sets of teachings,
particularly in view of the 'end of song' playback option of Sato, at
least suggest "concatenation... to the audio file" as properly
interpreted in view of the applicant's specification.

On page 6, lines 3-4, the applicant has stated, "The Fitzpatrick 
input file is not an audio file it may include video elements, 
graphical elements, document format control, etc.". The examiner 
respectfully notes that this statement in and of itself substantiates 
the fact that the input file of Fitzpatrick may be an audio file. 
Just as the input file may include the video, graphical, and such 
elements, it also may not, again, as evidenced even by the applicant's 
own choice of language. Regardless, the limitation was rejected by 
considering Sato in view of Fitzpatrick, not Fitzpatrick alone. The 
text and music input file in Sato is clearly an input audio file (mp3 
recording method) (para. 0065, for example). The system of 
Fitzpatrick can clearly process such a file, as evidenced by the 
flowchart shown in Figure 2 and the potential inclusion of audio sound 
and text words or phrases, col. 3, lines 41-53, in the input file.
The 'audio file' or mp3 formatted file of Sato is one form of
'multimedia file' so far as it includes both text (visual) and audio
(audible) types of media.

On page 6, lines 9-12, the applicant has stated, "They [the Final
Office action, as reiterated in the Advisory Action] argue that, 
because one can go from beginning to end of the flowchart shown in 
Fig. 2 of Fitzpatrick without traversing blocks 320 - 380, Fitzpatrick 
discloses processing for an audio-only file such as a file having the 
music data shown in Sato". The examiner respectfully notes that this 
statement fails to note that the mp3 formatted file of Sato includes 
both audio and text data (para. 0065, for example); as such, depending 
on the applicant's intended interpretation of the phrase 'audio-only',
the file of Sato is more appropriately called an 'audio-and-text-only'
file. Again, the system of Fitzpatrick would be able to handle such a
file, for example, through two passes through the 250-270 branch in
Figure 2, one pass for the audio data and one pass for the text data
shown in Figure 9. As stated above, Sato explicitly teaches an
audio-text file (mp3), and Fitzpatrick discloses a method of 
processing that would have been applicable to such an audio file. 

On page 6, lines 9-16, the applicant has stated, "They argue 
that, because one can go from beginning to end of the flowchart shown 
in Fig. 2 of Fitzpatrick without traversing blocks 320 - 380, 
Fitzpatrick discloses processing for an audio-only file such as a file 
having the music data shown in Sato" and "However, such is not the 
case" and "One cannot traverse from beginning to end of the flowchart 
in Fig. 2 without executing block 225" and "At block 225, a counter 
regarding audio entities in the file is initialized to zero" and 
"Thus, it is anticipated that non-audio entities may be encountered in 
the input file". The examiner respectfully notes, however, that block
225 is not between or in the path of blocks 320-380. Thus, the
applicant has failed to even contradict the characterization of the
Final Office and Advisory Actions that the applicant's own response
has presented. This would then suggest that "such" is the case, that
Fitzpatrick discloses processing compatible with or for audio-text-only
files including the music data file shown in Sato.

The statement from lines 15-16, that "it is anticipated that
non-audio entities may be encountered in the audio file" also supports the
position reiterated herein, that the system of Fitzpatrick would have
been compatible with the music and text data of Sato. Again, the
applicant's response uses the word 'may', which necessitates the
possibility of 'may not', the latter of which is applicable to the
input of Sato. Regardless, 'may' does not mean 'must' or indicate
inherency. The data of Sato meets the definition of 'multimedia' in
Fitzpatrick by its inclusion of both audio sound and convertible-text
data. As such, the methods, systems, and teachings of Fitzpatrick 
would have been appropriate for and applicable to the types of data 
and processing performed in the system of Sato. The applicant 
also appears to be suggesting that since Fitzpatrick can handle a 
broader array of data, including graphics and video than that which is 
addressed in Sato or the present application, then such teachings of 
Fitzpatrick are not applicable under 35 U.S.C. 103(a) for the lesser 
set of sound and text multimedia data present in the input of the 
system of Sato. The examiner respectfully disagrees. Fitzpatrick's 
ability to do more - to process more diverse multimedia files - does 
not prevent its application to the teachings of Sato for at least 
processing multimedia files with text and audio. Together, the 
teachings of Sato in view of Fitzpatrick at least suggest an 'audio 
file' (or a file comprising 'audio content', as further detailed above 
with regards to the discussion of 'concatenating ... to the audio file'). 

On page 6, lines 23-26, the applicant has again reiterated, 
"Neither one discloses, teaches or suggests 'concatenating at least a 
portion of an audio format of the descriptive information to the audio 
file' for which meta-data has been read". The examiner respectfully 
disagrees. As stated in the above rejection, Sato discloses the 
combination of music and synthesized text information for output from 
the same file and Fitzpatrick discloses a process for digitally 
combining the sound and synthesized text data from the same file for 
the purpose of output or playback. Together, in view of the details 
further discussed above, these references at least suggest the methods 
and apparatuses as claimed. 

On pages 7 and 8, the applicant continues to address the 
teachings of Sato and Fitzpatrick. As is clearly evident, the 
applicant's remarks continue to attack the references individually. 
The applicant's remarks repeatedly attack the references for 
limitations that they were not relied upon as teaching or at least 
suggesting, or not relied upon as at least suggesting alone. This is 
an ineffective way of showing nonobviousness, as is noted in the MPEP. 

On page 7, lines 10-11, the applicant has stated, "Neither the 
Sato or Fitzpatrick reference teaches, discloses, or suggests 
'generating a new audio file containing audio data resulting from the 
mixing'". The examiner respectfully disagrees. Sato clearly teaches 
mixing the audio signals that represent the music and synthesized text 
information (reading-out of synthesized data during reproduction of 
music data, para. 0050-0053, which is simultaneous reproduction, hence 
'mixing', particularly in view of combined signal line/output over 
speaker 14, Figure 2). Fitzpatrick discloses the putting together of 
such data (sound and synthesized text) into an output file (col. 3, 
lines 22-26, 66-7 and col. 4, lines 1-2 and 58-67). The combining of 
data into an output file in Fitzpatrick in view of the manner of 
combining such information (mixing) in Sato, and vice versa, at least 
suggests 'mixing an audio format of at least a portion of the 
descriptive information with the audio file' and 'generating a new 
audio file containing audio data resulting from the mixing'. 

On page 7, lines 15-16, the applicant has stated, "Applicant's do 
not claim 'handling' of digital audio signals". The examiner 
respectfully disagrees. Each of the independent claims recites 
"concatenating" or "mixing" digital audio data. These are two forms 
of 'handling' digital audio, so far as the term was used in the Advisory Action 
to refer to the teachings of Fitzpatrick. Fitzpatrick teaches the 
particular claimed 'concatenating' form of digital audio data 
'handling', in view of the serial reproduction of Sato, and Sato in 
view of Fitzpatrick is considered to at least suggest the particular 
'mixing' form of digital audio data handling. The applicant has 
neither addressed nor refuted what the collective teachings of Sato in 
view of Fitzpatrick at least suggest, as was applied in the final 
office action. 

On page 7, lines 17-18, the applicant has stated, "As is argued 
above, Sato does not disclose, teach or suggest 'generating a new 
audio file containing audio data resulting from the mixing'". Again, 
the examiner respectfully notes that this argument is moot since Sato, 
alone, was never said to teach or suggest this limitation. Rather, 
Sato in view of Fitzpatrick was considered to at least suggest this 
limitation. Regarding this limitation, Sato is considered to teach 
'generating a new audio signal containing an audio signal resulting 
from the mixing', while Fitzpatrick, in view of the processing and 
combination of signals in the digital domain into a new output file, 
at least suggests embodying such a new signal in an audio file. 

On page 7, line 20, the applicant has stated, "Sato does not 
generate an output file". Again, the examiner respectfully notes that 
this argument is moot since Sato, alone, was never said to teach or 
suggest this limitation. Fitzpatrick discloses the generation or 
"writing" of pieces to an output file, which at least suggests the 
claimed "a new audio file" (steps 220, 260, 280, Figure 2 of 
Fitzpatrick, for example). 

On page 7, lines 21-23, the applicant has stated, "Similarly, 
Fitzpatrick also fails to disclose, teach or suggest 'generating a new 
audio file containing audio data resulting from the mixing', at least 
because Claim 6 recites that the mixing relates to 'descriptive 
information' about the audio file". Again, the examiner respectfully 
notes that this argument is moot since Fitzpatrick, alone, was never 
said to teach or suggest this limitation. Rather, Fitzpatrick 
discloses digitized data representing text and sound from an 
input file, and Sato teaches or at least suggests that such text is 
'descriptive information' (para. 0065) and that such information is 
combined via mixing (Figure 8 and simultaneous reproduction, bottom 
portion of Figure 1). One cannot show nonobviousness by attacking 
references individually where the rejections are based on combinations 
of references. Accordingly, this and other such arguments by the 
applicant are unpersuasive. 

On page 7, lines 23-25, the applicant has stated, "Because Sato 
does not show the generating of an audio file, the Examiner's 
rejection must fail because, as is discussed below, Fitzpatrick does 
not show this element either". The examiner respectfully notes, 
however, that generating an audio file (and Fitzpatrick's alleged 
failure to show this element) is not 'discussed below' in the 
applicant's brief. A conclusion is noted in line 20 of page 8, but no 
support is present in the remarks preceding this statement, which 
instead discuss additional, unapplied features of Fitzpatrick, but not 
the features for which Fitzpatrick was specifically relied upon in 
the final rejection. The applicant's remarks neither address nor 
contradict that the output file of Fitzpatrick (col. 3, lines 22-26, 
66-7 and col. 4, lines 1-2 and 58-67), which is written during the 
processing of the input file, at least suggests 'generating a new 
audio file containing audio data'. Accordingly, writing this 'output 
file' is understood to at least suggest the generating of the new 
audio file as claimed, in further view of the simultaneous 
reproduction (thus, mixing) teachings of Sato in regards to the manner 
in which data is combined. Again, the motivation behind combining 
such teachings would have been that such digital processing would not 
have required hardware capable of efficient processing for the real 
time production of output, as is suggested by the teachings of 
Fitzpatrick (col. 5, lines 8-13). 

On page 8, lines 3-5, the applicant has stated, "However, 
Fitzpatrick does not disclose, suggest or teach that the human 
discernible entity may be meta data that includes descriptive 
information about an audio file, as recited in Claim 6". Again, the 
examiner respectfully notes that this argument is moot since 
Fitzpatrick, alone, was never said to teach or suggest this 
limitation. Rather, Fitzpatrick discloses that an input file may 
contain a text word or phrase which can be converted to spoken word or 
phrase and Sato, as was relied upon in the final rejection, more 
particularly discloses that such text within a file may convey 
descriptive information (discussion of metadata in Sato, para. 0065 
and Figure 9, as noted in regards to Claim 1, which was referenced in 
the rejection of Claim 6). 

On page 8, lines 10-13, the applicant has stated, "Indeed, 
Fitzpatrick teaches away from the technique of Sato" and "That is, the 
problem that Fitzpatrick seeks to solve is the loss of non-discernable 
data that the human ear cannot understand (such as video or graphics 
data) from a converted audio file. See Fitzpatrick, Col. 3, lines 57 - 
61". The examiner respectfully disagrees. The citation provided by 
the applicant from column 3 discloses types of non-discernable data, 
but not that this data is the sole or whole problem sought to be 
solved by Fitzpatrick. Again, the applicant appears to be arguing 
that since the teachings of Fitzpatrick can be applied to a broader 
array of multimedia files than those found in Sato, the teachings of 
Fitzpatrick are not applicable to the types of files handled in Sato. 
The basis of this argument is not clear, nor is it persuasive. The 
applicant has provided no evidence or documentation to support this 
position; arguments of counsel cannot take the place of factually 
supported objective evidence. As understood by the Office, the 
additional features or capabilities of Fitzpatrick do not constitute 
"teaching away", so far as they would not lead away from the claimed 
invention. On the contrary, the purpose of the system of Fitzpatrick, 
as is stated by Fitzpatrick, is media boundary traversal, comprising 
the transformation of a multimedia file to and from an audio output 
that includes portions understandable to humans while preserving both 
the discernability of the human discernable output to the unaided 
human ear and also the integrity of the underlying data (col. 1, line 
59-col. 2, line 5). Nothing in this purpose requires that the system 
of Fitzpatrick actually be applied to an input file with 
non-discernable data, just as blocks 320-380 are not necessarily included 
in a path from Start to Stop in Figure 2, even if the capability to 
handle such entities is there. The system of Fitzpatrick would still 
convert the input file to and from the disclosed audio media, 
traversing the visual media boundary of the text in the input file. 
The teachings of Fitzpatrick are clearly analogous to and combinable with 
those of Sato, so far as the process of Fitzpatrick would have been 
able to allow the text to be converted to an audio medium. The benefit 
provided by the underlying process of Fitzpatrick, so far as is 
applicable to Sato and applied in the final rejection, does not hinge 
on the inclusion or exclusion of any such non-discernable data. 

On page 8, lines 13-17, the applicant has stated, "If Fitzpatrick 
did, as the Office Action claims, teach 'concatenating at least a 
portion of an audio format of the descriptive information to the audio 
file' such that the initial audio and metadata were contained in an 
input audio file, then no non-discernable data would be present in 
such input file, and the motivation behind the Fitzpatrick disclosure 
would be obviated". Again, the examiner respectfully notes that this 
argument is moot since Fitzpatrick, alone, was never said to teach or 
suggest this limitation. Rather, such a limitation was rejected 
taking the teachings of Sato in further consideration of Fitzpatrick, 
as is discussed at length above. Further, the allegation of "the 
motivation behind the Fitzpatrick disclosure would be obviated" is not 
pertinent to the present prosecution; the rejection is based on what 
Fitzpatrick does teach, not what Fitzpatrick may or may not have 
taught if the input file comprised only audio and text. This 
statement by the applicant concerning the 'motivation to yet disclose' 
of Fitzpatrick amounts to mere conjecture, not supported or 
supportable by any evidence or documentation of record. The teachings 
of Fitzpatrick are applicable to Sato for at least what they do teach. 
The failure to utilize all features of the secondary reference does 
not preclude the application of the secondary reference in the 
combination for what is disclosed. The applicant appears to be 
arguing that recognizing and claiming, as part of the pending claims, 
latent properties of the system of Fitzpatrick (specifically, the 
ability to convert audio-text files to audio media) are unobvious 
because the reference of Fitzpatrick includes additional capabilities. 
However, it is well established that mere recognition of latent 
properties in the prior art does not render nonobvious an otherwise 
known invention. The applicant's claiming of less than the full 
capability of Fitzpatrick (in view of and as applied to Sato) does not 
distinguish or make such claims patentable, nor prevent capabilities 
of Fitzpatrick from being considered in view of their counterparts of 
Sato. 

On page 8, lines 17-19, the applicant has stated, "combining 
Fitzpatrick with Sato would obviate the need for Fitzpatrick, because 
the Sato file would not include non-discernable data". The examiner 
respectfully disagrees. As discussed above, at least part of the 
teachings of Fitzpatrick are applicable and can be motivated into 
analogous aspects of Sato. Arguing additional properties of 
Fitzpatrick that are not needed (or, alternatively, may be included, 
but would not ever be utilized in formal processing) does not preclude 
other aspects of Fitzpatrick from being applied to analogous aspects 
of the teachings of Sato. The motivation provided in Fitzpatrick is 
at least applicable to the teachings that are relied upon in 
Fitzpatrick. 

On page 8, lines 20-21, the applicant has stated, "Thus, 
Fitzpatrick does not read on 'generating a new audio file'". 
The examiner respectfully disagrees. The applicant's remarks neither 
address nor contradict that the output file of Fitzpatrick (col. 3, 
lines 22-26, 66-7 and col. 4, lines 1-2 and 58-67), which is written 
during the processing of the input file, at least suggests 'generating 
a new audio file containing audio data'. The applicant's remarks, 
instead, address aspects of Fitzpatrick which were not relied upon 
in the final rejection. As noted above, this argument is spurious 
with regards to the manner in which Fitzpatrick has been applied. 
Accordingly, writing the 'output file' in Fitzpatrick is understood to 
at least suggest the generating of the new audio file as claimed, in 
further view of the simultaneous reproduction (thus, mixing) teachings 
of Sato in regards to the manner in which data is combined. 

On page 8, lines 20-25, the applicant has stated, "'[H]andling of 
digital audio signals' as asserted in the Advisory Action does not 
teach, suggest, or disclose the limitation of 'generating a new audio 
file containing audio data resulting from the mixing'" and "This is 
true at least for the reason that the claim limitations of Claim 6 
make it clear that the claimed mixing is performed in relation to 
'descriptive information' mixed with the initial 'audio file'". 
Again, the examiner respectfully notes that this argument is moot 
since Fitzpatrick, alone, was never said to teach or suggest this 
limitation. Rather, such a limitation was rejected taking the 
teachings of Sato in further consideration of Fitzpatrick, as is 
discussed at length above. Sato is particularly relied on for 
suggesting the 'mixing' type of data combining or handling, as well as 
the fact that such mixing is performed in relation to 'descriptive 
information' and the initial audio file (which again, as discussed 
above, is properly interpreted as the audio contents of an audio file, 
not the literal whole audio file). 

On page 8, lines 25-28, the applicant has stated, "Even if 
Applicants were to concede to the Advisory Action's characterization 
that Sato discloses mixing an audio signal, the Advisory Action is 
fatally flawed because it fails to make a prima facie showing of 
'generating a new audio file containing audio data resulting from the 
mixing'". While the applicant is entitled to this, their own 
opinion, the examiner respectfully disagrees. Sato in further view of 
Fitzpatrick at least suggests this limitation. Sato discloses 
generating a new analog audio signal resulting from mixing. 
Fitzpatrick discloses generating a new digital audio file (and thus, 
signal) resulting from the combination of audio sound and synthesized 
text digital signals. Applying the particular type of signal 
combination or handling of Sato (the mixing) with the format or manner 
of combining audio signals in Fitzpatrick (through the generation of a 
new file for output) at least suggests the limitation of 'generating a 
new audio file containing audio data resulting from the mixing'. 
Again, Fitzpatrick provides the motivation for utilizing the digital 
format of such data combining, that hardware capable of efficient, 
real time processing would not be necessary for this digital style of 
data handling, as opposed to the simultaneous, real-time handling and 
output of signals in the system of Sato. It is further noted that the 
digital handling of signals in Fitzpatrick only requires one D/A 
converter for output (col. 4, line 67-col. 5, line 1), as opposed to 
the two necessitated for the system of Sato (para. 0055). 

On page 9, the applicant notes the other claims in the 
application including the dependent claims, though no further 
arguments, beyond those which are addressed above, are presented. 
Accordingly, so far as the applicant's arguments have been addressed 
above, it is respectfully submitted that such responses also suffice 
to substantiate the rejection of these other and dependent claims as 
well, so far as no further arguments or alleged discrepancies are 
presented by the applicant, or believed to be present by the 
examiner. As such, the rejections in further view of Yumura are also 
appropriate and properly presented in the final office action. 

As set forth above, the applicant's attacks against the 
references individually, as well as the arguments regarding latent 
properties, are unpersuasive. 

(11) Related Proceeding(s) Appendix 

No decision rendered by a court or the Board is identified by the 
examiner in the Related Appeals and Interferences section of this 
examiner's answer. 

For the above reasons, it is believed that the rejections should 
be sustained. 

Respectfully submitted, 
Andrew Graham 